Self-Organizing-Map-Based Metamodeling for Massive Text Data Exploration

Authors

  • Kin Keung Lai
  • Lean Yu
  • Ligang Zhou
  • Shouyang Wang
Abstract

In this study, we describe the use of the self-organizing map (SOM) as a metamodeling technique for designing a parallel text data exploration system. First, a large text collection is divided into many small data subsets. A separate unitary SOM model (a base model) is then trained on each subset to produce a word clustering map. In this phase, the SOM models are trained in parallel for greater computational efficiency. Finally, a SOM-based metamodel is produced that forms a text category map by learning from all the base models. For illustration, the proposed metamodel is applied to a massive text data collection.
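The three phases described in the abstract (split, train base SOMs in parallel, train a SOM metamodel on the base models) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not say how the metamodel learns from the base models, so the code assumes it is trained on the stacked codebook vectors of the base maps. The function names `train_som` and `train_metamodel`, the map sizes, the decay schedule, and the use of random numeric vectors in place of real word or document features are all assumptions made for illustration.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def train_som(data, rows, cols, epochs=20, lr0=0.5, seed=0):
    """Train a small rectangular SOM; return a codebook of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    # Grid coordinates of every unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise the codebook from randomly chosen training samples.
    codebook = data[rng.integers(0, n, size=rows * cols)].astype(float)
    sigma0 = max(rows, cols) / 2.0
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighbourhood radius
            x = data[i]
            # Best-matching unit: the codebook vector closest to the sample.
            bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
            # Gaussian neighbourhood around the BMU on the map grid.
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            codebook += lr * h[:, None] * (x - codebook)
            step += 1
    return codebook


def train_metamodel(data, n_subsets=4, base_shape=(3, 3), meta_shape=(4, 4), seed=0):
    """Split the data, train base SOMs concurrently, then train a meta-level SOM."""
    rng = np.random.default_rng(seed)
    # Phase 1: divide the collection into small subsets.
    subsets = np.array_split(data[rng.permutation(len(data))], n_subsets)
    # Phase 2: train one base SOM per subset; the runs are independent of each other.
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(train_som, subset, *base_shape, seed=seed + k)
            for k, subset in enumerate(subsets)
        ]
        base_codebooks = [f.result() for f in futures]
    # Phase 3 (assumption): the metamodel learns from the base models' codebooks.
    meta_input = np.vstack(base_codebooks)
    meta_codebook = train_som(meta_input, *meta_shape, seed=seed)
    return base_codebooks, meta_codebook
```

Calling `train_metamodel(X)` on a numeric feature matrix `X` returns the list of base codebooks and the meta-level codebook. Threads are used here only to keep the sketch short; because the base models do not depend on one another, a process pool or separate machines could be substituted where real parallel speed-up is needed.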

Similar resources

Text Data Mining

Classification is one of the central issues in any system dealing with text data. The need for effective approaches is dramatically increased nowadays due to the advent of massive digital libraries containing free-form documents. What we are looking for are powerful methods for the exploration of such libraries whereby the discovery of similarities between groups of text documents is the overall ...

Exploration of Document Collections with Self-Organizing Maps: A Novel Approach to Similarity Representation

Classification is one of the central issues in any system dealing with text data. The need for effective approaches is dramatically increased nowadays due to the advent of massive digital libraries containing free-form documents. What we are looking for are powerful methods for the exploration of such libraries whereby the detection of similarities between the various text documents is the overal...

Finding structure in text archives

With the advance and massive growth of electronic text archives, the need for tools emerges, which help to gain insight into the basic structure of the underlying digital library. We present a neural network approach for the analysis and exploration of text archives aiming at the detection and visualization of the inherent structure of the text collection. This cluster visualization technique c...

Exploration of Text Collections with Hierarchical Feature

Document classification is one of the central issues in information retrieval research. The aim is to uncover similarities between text documents. In other words, classification techniques are used to gain insight in the structure of the various data items contained in the text archive. In this paper we show the results from using a hierarchy of self-organizing maps to perform the text classificat...

Exploration of Full-text Databases with Self-organizing Maps

Availability of large full-text document collections in electronic form has created a need for intelligent information retrieval techniques. Especially the expanding World Wide Web presupposes methods for systematic exploration of miscellaneous document collections. In this paper we introduce a new method, the WEBSOM, for this task. Self-Organizing Maps (SOMs) are used to represent documents on...

Publication date: 2006